# Real-time Transcription

Faster Distil Whisper Large V3.5

Distil-Whisper is a distilled version of the Whisper model, optimized for automatic speech recognition (ASR) and offering faster inference.

Speech Recognition English

Faster Distil Whisper Large V3.5

A CTranslate2-format model converted from Distil-Whisper large-v3.5 for efficient speech recognition.

Speech Recognition English
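A CTranslate2-format Whisper checkpoint is commonly loaded through the third-party `faster-whisper` package. The listing does not say which runtime this model targets, so the following is only a minimal sketch under that assumption. The `model_dir` and `audio_path` arguments are hypothetical placeholders, and `format_segment` is a helper introduced here for illustration.

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Render one transcribed segment as '[start -> end] text'."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe(model_dir: str, audio_path: str) -> list[str]:
    """Transcribe an audio file with a CTranslate2 Whisper model.

    Assumption: the faster-whisper package is installed
    (pip install faster-whisper) and model_dir points to a local
    directory (or hub ID) holding the converted model. Both are
    placeholders, not values taken from the listing.
    """
    # Imported lazily so the helper above works without the dependency.
    from faster_whisper import WhisperModel

    # int8 on CPU is a common low-memory setting; adjust for your hardware.
    model = WhisperModel(model_dir, device="cpu", compute_type="int8")

    # transcribe() returns a lazy generator of segments plus run metadata.
    segments, _info = model.transcribe(audio_path, beam_size=5, language="en")
    return [format_segment(s.start, s.end, s.text) for s in segments]
```

Calling `transcribe("path/to/model", "sample.wav")` would return one formatted line per segment. Decoding only happens as the segment generator is consumed, which is why the list comprehension is what actually runs the model.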

Whisper Large V3 Turbo Gguf

Whisper large-v3-turbo is a pruned, fine-tuned version of Whisper large-v3 in which the number of decoder layers is reduced from 32 to 4, significantly improving speed at the cost of a slight drop in quality.

Speech Recognition Supports Multiple Languages

Distil Large V3.5 Ct2

Distil-Whisper is a distilled version of the Whisper model that achieves efficient speech recognition through large-scale pseudo-labeling.

Speech Recognition English

Lite Whisper Large V3 Turbo Acc

Lite-Whisper is a lightweight version of OpenAI Whisper compressed using LiteASR technology, maintaining high accuracy while reducing model size.

Speech Recognition

efficient-speech

Moonshine Base ONNX

An ONNX-format automatic speech recognition model based on the Moonshine base model, supporting efficient inference.

Speech Recognition

Moonshine Tiny ONNX

Moonshine Tiny is a lightweight automatic speech recognition (ASR) model suitable for embedded devices and edge computing scenarios.

Speech Recognition

Moonshine is a series of automatic speech recognition (ASR) models developed by Useful Sensors, specifically designed for English speech transcription, excelling on resource-constrained platforms.

Speech Recognition

Transformers English

Whisper Tiny Chinese

A speech recognition model based on OpenAI Whisper Tiny, fine-tuned on the Common Voice 11.0 Chinese dataset.

Speech Recognition

Transformers Chinese

Whisper Base.en

Whisper is a general-purpose speech recognition model trained by OpenAI using large-scale weak supervision. The Whisper family supports speech transcription in multiple languages; the `.en` checkpoints, including base.en, are the English-only variants.

Speech Recognition

Whisper is an automatic speech recognition (ASR) system trained by OpenAI, supporting multilingual speech transcription.

Speech Recognition

Faster Distil Whisper Large V3

A distilled version of Whisper Large v3 for efficient automatic speech recognition (ASR).

Speech Recognition English

Nue ASR is an end-to-end Japanese speech recognition model that integrates pre-trained speech and language models, offering high accuracy and fast recognition speed.

Speech Recognition

Transformers Supports Multiple Languages

Distil Medium.en

Distil-Whisper is a distilled version of the Whisper model, 6 times faster than the original, with a 49% reduction in size, while maintaining performance close to the original in English speech recognition tasks.

Speech Recognition English

Whisper Small Ml

This model is a fine-tuned version of openai/whisper-small for speech recognition, supporting multiple languages and suitable for automatic speech recognition tasks.

Speech Recognition

Whisper Small Turkish Tr Best

A Turkish speech recognition model fine-tuned from OpenAI Whisper-small, with a reported word error rate (WER) of 26.34%.

Speech Recognition
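For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The sketch below illustrates that standard definition. The listing does not state how the 26.34% figure was computed (for example, which text normalization was applied), and the function name and sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"{wer:.2%}")  # prints 33.33%
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.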

Whisper Medium is a medium-scale speech recognition model developed by OpenAI, supporting automatic speech recognition (ASR) tasks in multiple languages.

Speech Recognition

Whisper Small is a small automatic speech recognition (ASR) model developed by OpenAI, capable of converting speech into text.

Speech Recognition

Whisper is an automatic speech recognition (ASR) system trained by OpenAI, supporting speech-to-text tasks in multiple languages.

Speech Recognition

Faster Whisper Small

A Transformer-based automatic speech recognition (ASR) model supporting multilingual transcription.

Speech Recognition Supports Multiple Languages

Wav2vec2 Live Japanese

A Japanese speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, supporting hiragana output.

Speech Recognition

Transformers Japanese

Waynehills STT Doogie Server

A fine-tuned speech recognition model based on Doogie/Waynehills-STT-doogie-server.

Speech Recognition

Distil Wav2vec2

Distil-wav2vec2 is a distilled version of the wav2vec2 model, with a 45% reduction in size and a two-fold increase in inference speed, suitable for automatic speech recognition tasks.

Speech Recognition

Transformers English

© 2025 AIbase